Adapting hierarchical clustering distance measures for improved presentation of relationships between transaction elements

Authors

  • Mihaela Vranić
  • Damir Pintar
  • Dragan Gamberger
Abstract

A common goal of descriptive data mining techniques is to present new information in concise, easily interpretable ways. Hierarchical clustering, for example, enables simple visualization of distances between analyzed objects or attributes. However, the distance measures commonly offered by existing data mining tools are usually not well suited to analyzing transactional data with this technique. Including new types of measures aimed specifically at transactional data can make hierarchical clustering a much more feasible choice for transactional data analysis. This paper presents and analyzes suitable measure types and provides methods of transforming them so that they represent distances between transaction elements more appropriately. The developed measures are implemented, verified, and compared in hierarchical clustering analyses on both artificial data and reference transactional datasets.
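The abstract does not name the specific measures the authors develop, so the following is only a minimal sketch of the general idea: derive a distance between transaction elements (items) from the transactions they share, then feed it to agglomerative clustering. The Jaccard distance, the average-linkage rule, the function names, and the toy basket data are all assumptions chosen for illustration, not the paper's method.

```python
from itertools import combinations

def item_distances(transactions):
    """Jaccard distance between items, based on the transactions containing each.

    Assumption: Jaccard is used only as a familiar stand-in measure.
    """
    occurrences = {}
    for tid, items in enumerate(transactions):
        for item in items:
            occurrences.setdefault(item, set()).add(tid)
    dist = {}
    for a, b in combinations(sorted(occurrences), 2):
        both = len(occurrences[a] & occurrences[b])
        either = len(occurrences[a] | occurrences[b])
        dist[frozenset((a, b))] = 1.0 - both / either
    return sorted(occurrences), dist

def average_linkage(items, dist):
    """Agglomerative clustering; returns merges as (left, right, height)."""
    clusters = [(item,) for item in items]
    merges = []

    def d(c1, c2):
        # Mean pairwise item distance between two disjoint clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=lambda p: d(*p))
        merges.append((left, right, d(left, right)))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
    return merges

# Hypothetical toy transactions (market baskets).
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"beer", "chips"},
    {"beer", "chips", "milk"},
]
items, dist = item_distances(transactions)
merges = average_linkage(items, dist)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Items that always appear together merge at height 0, while items that never co-occur sit at distance 1 regardless of how frequent each is. That insensitivity is one reason a generic measure may present relationships between transaction elements poorly.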


Similar resources

Comprehensive Survey on Distance / Similarity Measures between Probability Density Functions

Distance or similarity measures are essential to solve many pattern recognition problems such as classification, clustering, and retrieval problems. Various distance/similarity measures that are applicable to compare two probability density functions, pdf in short, are reviewed and categorized in both syntactic and semantic relationships. A correlation coefficient and a hierarchical clustering ...


An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...


Taxonomy of Nominal Type Histogram Distance Measures

Abstract: Distance or similarity measures are of fundamental importance to pattern classification, clustering, and information retrieval problems. Various distance/similarity measures that are applicable to compare two nominal type histograms are reviewed and categorized in both syntactic and semantic relationships. A correlation coefficient and a hierarchical clustering technique are adopted t...


Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...


New distance and similarity measures for hesitant fuzzy soft sets

The hesitant fuzzy soft set (HFSS), as a combination of hesitant fuzzy and soft sets, is regarded as a useful tool for dealing with the uncertainty and ambiguity of real-world problems. In HFSSs, each element is defined in terms of several parameters with arbitrary membership degrees. In addition, distance and similarity measures are considered as the important tools in different areas such as ...




Publication date: 2012